drpolicy controller define #200
Conversation
Force-pushed from ee4fc2a to 9cf5473 (compare)
controllers/drpolicy_controller.go
Outdated
case true:
	log.Info("create/update")

	controllerutil.AddFinalizer(drpolicy, finalizerName)
It is better to check if the finalizer exists.
if !controllerutil.ContainsFinalizer(drpolicy, finalizerName) {
	controllerutil.AddFinalizer(drpolicy, finalizerName)

	if err := r.Update(ctx, drpolicy); err != nil {
		return ctrl.Result{}, fmt.Errorf("finalizer add update: %w", err)
	}
}
AddFinalizer checks if it exists before adding it.
I understand, but I am talking about the whole block. Your reconciler can be called several times, and it will try to update and create roles as many times as it is called, for as long as the object is not being deleted.
I think I understand: an optimization to skip the Client.Update() if the finalizer is already present.
Oh, actually more than that, an infinite loop!
An optimization to update only if the finalizer was actually added has been added. Role creation seems optimal already.
controllers/drpolicy_controller.go
Outdated
return ctrl.Result{}, fmt.Errorf("cluster roles delete: %w", err) | ||
} | ||
|
||
controllerutil.RemoveFinalizer(drpolicy, finalizerName) |
same here
RemoveFinalizer checks if it exists before removing it.
Same as above
An optimization to update only if the finalizer was actually removed has been added.
var clusterRolesMutex sync.Mutex

func (mwu *MWUtil) ClusterRolesCreate(drpolicy *rmn.DRPolicy) error {
	clusterRolesMutex.Lock()
Why do you need a lock? ClusterRoles are created from the reconciler, and I believe that is thread safe.
It is my understanding that a hub can have multiple DRPolicies each with at most one reconciler running at a time. The lock serializes concurrent creates, deletes, or creates and deletes.
Right, but I am not sure what the lock is protecting. I have to think about it. Ignore my comment for now.
Suppose drpolicyA specifies cluster1 and is being deleted, while drpolicyB also specifying cluster1 is being created. drpolicyA's reconciler reads all drpolicies and finds no other drpolicies include cluster1 because drpolicyB has not yet been created, so it determines cluster1's roles can be deleted. But before it does so, drpolicyB is created and its reconciler determines cluster1's roles need not be created because it already exists, then drpolicyA's reconciler deletes the roles that drpolicyB expects to exist. The mutex makes the read-modify-write operation atomic.
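A minimal sketch of that mutex-guarded read-modify-write (inUse and deleteRoles are hypothetical stand-ins for the actual list and delete operations):

import "sync"

var clusterRolesMutex sync.Mutex

func deleteOrphanedClusterRoles(
	clusters []string,
	inUse func(cluster string) bool,
	deleteRoles func(cluster string) error,
) error {
	clusterRolesMutex.Lock()
	defer clusterRolesMutex.Unlock()

	// Holding the lock across the read and the write below is the point:
	// without it, a concurrent reconciler could observe the roles as
	// present just before they are deleted here.
	for _, cluster := range clusters {
		if inUse(cluster) {
			continue // another DRPolicy still references this cluster
		}

		if err := deleteRoles(cluster); err != nil {
			return err
		}
	}

	return nil
}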
for i := range manifestworks.Items {
	manifestwork := &manifestworks.Items[i]
	if manifestwork.ObjectMeta.Name == ClusterRolesManifestWorkName {
		*clusterNames = clusterNames.Insert(manifestwork.ObjectMeta.Namespace)
What the heck is this doing? I can't wrap my head around it. It is inserting a cluster name (which is the MW namespace) into a map of strings, but the return value of Insert is a String, which is assigned to *clusterNames???
It outputs a set of namespace names, i.e. cluster names, that have a manifest work named "ramendr-roles". It uses Kubernetes' sets.String, which is a set of strings, implemented as a map with string keys and empty values.
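A small standalone example of why that assignment type-checks; Insert returns the set itself, so it can be chained or assigned back:

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/util/sets"
)

func main() {
	// sets.String is a map[string]sets.Empty; Insert returns the set.
	clusterNames := sets.String{}
	clusterNames = clusterNames.Insert("cluster1").Insert("cluster2")
	clusterNames.Insert("cluster1") // duplicate inserts are no-ops

	fmt.Println(clusterNames.List()) // sorted: [cluster1 cluster2]
}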
Okay. So I was looking at this Insert. So you are not using that one then??? I can double-check, I am just being lazy.
https://github.com/kubernetes/apimachinery/blob/v0.22.0/pkg/util/sets/string.go#L49
Ah, got it. The confusion is the name String in the return value; that threw me off. I thought it was a simple string, but it is typedefed to a map.
This routine is currently only used by a unit test to confirm that only the expected cluster roles manifestworks exist.
Force-pushed from 79498b0 to 208827d (compare)
@@ -595,7 +597,7 @@ func verifyVRGManifestWorkCreatedAsPrimary(managedCluster string) {
 		return err == nil
 	}, timeout, interval).Should(BeTrue())

-	Expect(len(createdVRGRolesManifest.Spec.Workload.Manifests)).To(Equal(2))
+	Expect(len(createdVRGRolesManifest.Spec.Workload.Manifests)).To(BeNumerically(">=", 2))
Why isn't the "ramendr-vrg-roles" count exact? It should be 2 per managed cluster, right?
The vrg and pvc cluster roles and role bindings are now in the same manifestwork (to simplify management since they have the same lifetime) for a total of 4 manifests.
So, test against 4 then. IOW, if the len of Manifests is 6, is that OK? Because the expectation would still succeed.
I think more manifests are ok. This particular test applies only the first two manifests and then verifies a vrg can be created.
controllers/drpolicy_controller.go
Outdated
type DRPolicyReconciler struct {
	client.Client
	Scheme *runtime.Scheme
	// Log logr.Logger
Remove this
Removed!
finalizerCount := len(drpolicy.ObjectMeta.Finalizers)
controllerutil.AddFinalizer(drpolicy, finalizerName)

if len(drpolicy.ObjectMeta.Finalizers) != finalizerCount {
(this is kind of a nit) Can we stick to one pattern when adding and checking finalizers? The following is kind of the same pattern that the DRPC and the VRG are using:
if !controllerutil.ContainsFinalizer(drpolicy, finalizerName) {
controllerutil.AddFinalizer(drpolicy, finalizerName)
client.Update(ctx, drpolicy)
...
}
I am not saying this is the same pattern, but I am suggesting sticking to one.
I agree. I chose this method because len() of a slice seems to be a constant time operation, whereas AddFinalizer, RemoveFinalizer, and ContainsFinalizer are each linear. It'd be nice if Add and Remove would return a boolean.
I don't think that performance is an issue here.
I am going to suggest we stick to the ContainsFinalizer pattern, not because this is incorrect, it is just deviant from the pattern, and any fixes/updates to controller-util should retain the interface if we follow the pattern. (and as stated this is a nit)
> I agree. I chose this method because len() of a slice seems to be a constant time operation, whereas AddFinalizer, RemoveFinalizer, and ContainsFinalizer are each linear. It'd be nice if Add and Remove would return a boolean.

Pull request to address this: kubernetes-sigs/controller-runtime#1636
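A sketch of the idea behind that pull request, namely having the add report whether the object changed (the actual signature adopted upstream may differ):

import (
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// addFinalizer adds the finalizer only if absent and reports whether the
// object changed, letting the caller skip Update when nothing changed.
func addFinalizer(obj client.Object, finalizer string) bool {
	for _, f := range obj.GetFinalizers() {
		if f == finalizer {
			return false
		}
	}

	obj.SetFinalizers(append(obj.GetFinalizers(), finalizer))

	return true
}

A caller could then gate the Update call on the returned boolean.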
api/v1alpha1/drpolicy_types.go
Outdated
@@ -74,6 +74,15 @@ type DRPolicy struct {
 	Status *DRPolicyStatus `json:"status,omitempty"`
 }

+func (drpolicy *DRPolicy) ClusterNames() []string {
Move this out of the *types.go file
This seems, to me, like a logical place for it. Is it a matter of not including it in an external interface? Users may also find it useful and it seems trivial to support.
Correct. Put that function in the implementation file.
It has been moved to ramen's private util package.
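A sketch of what the relocated helper might look like; the Spec.DRClusterSet field and its Name member are assumptions for illustration, not the actual API:

package util

import (
	rmn "github.com/ramendr/ramen/api/v1alpha1"
)

// DRPolicyClusterNames returns the names of the clusters a DRPolicy spans.
// Spec.DRClusterSet and Name are assumed field names; consult the actual
// DRPolicy type for the real ones.
func DRPolicyClusterNames(drpolicy *rmn.DRPolicy) []string {
	clusterNames := make([]string, 0, len(drpolicy.Spec.DRClusterSet))

	for _, managedCluster := range drpolicy.Spec.DRClusterSet {
		clusterNames = append(clusterNames, managedCluster.Name)
	}

	return clusterNames
}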
Force-pushed from 3408cea to a935e72 (compare)
api/v1alpha1/drpolicy_types.go
Outdated
	return clusterNames
}
I added two similar DRPolicy methods, which raised the following best-practice related questions:
- Benamar: Should the *types.go file contain implementation other than the Init function?
- Shyam felt that DRPolicy methods are in the deepcopy file and did not want any repercussions due to more methods getting added to the API itself.
Given the above concerns, I changed them from methods to helper functions and placed them in util/drpolicy_util.go.
@@ -595,7 +597,7 @@ func verifyVRGManifestWorkCreatedAsPrimary(managedCluster string) {
 		return err == nil
 	}, timeout, interval).Should(BeTrue())

-	Expect(len(createdVRGRolesManifest.Spec.Workload.Manifests)).To(Equal(2))
+	Expect(len(createdVRGRolesManifest.Spec.Workload.Manifests)).To(BeNumerically(">=", 2))
It is not clear why the ">=" is required instead of "==". It may help to comment the expectation.
Benamar raised this issue too: #200 (comment). The code immediately following this only references the first two manifests. Perhaps the pv roles manifests ought to be tested like the vrg ones are.
@@ -760,13 +762,13 @@ var _ = Describe("DRPlacementControl Reconciler", func() {
 	verifyUserPlacementRuleDecision(userPlacementRule.Name, userPlacementRule.Namespace, WestManagedCluster)
 	verifyDRPCStatusPreferredClusterExpectation(rmn.FailedOver)
 	verifyVRGManifestWorkCreatedAsPrimary(WestManagedCluster)
-	Expect(getManifestWorkCount(WestManagedCluster)).Should(Equal(4)) // MWs for VRG+ROLES+PVs
+	Expect(getManifestWorkCount(WestManagedCluster)).Should(Equal(3)) // MWs for VRG+ROLES+PVs
(nit) This comment is not related to this PR and not addressed to @hatfieldbrian. When we get a chance in the future, I suggest that instead of using hard-coded values, we use a variable that tracks how many manifest works are expected. The variable could be incremented when a new MW is created and decremented when a MW is deleted. A sketch follows.
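A minimal sketch of that suggestion, with hypothetical names:

// Tests adjust the expectation wherever they create or delete a MW, then
// assert against the tracked value instead of a hard-coded literal.
var expectedManifestWorkCount int

func expectManifestWorkCreated(n int) { expectedManifestWorkCount += n }
func expectManifestWorkDeleted(n int) { expectedManifestWorkCount -= n }

// Expect(getManifestWorkCount(WestManagedCluster)).Should(Equal(expectedManifestWorkCount))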
It("should create a cluster roles manifest work for each cluster added", func() { | ||
}) | ||
It("should delete a cluster roles manifest work for each cluster removed", func() { | ||
}) |
Are these two Its to be implemented later? If yes, a TODO comment may help.
TODO added
controllers/suite_test.go
Outdated
Expect((&ramencontrollers.DRPolicyReconciler{ | ||
Client: k8sManager.GetClient(), | ||
Scheme: k8sManager.GetScheme(), | ||
// Log: ctrl.Log.WithName("controllers").WithName("DRPolicy"), |
Why comment out the log "DRPolicy" qualifier? Won't the qualifier be useful when the log is output during test failures?
These log qualifiers are added in DRPolicyReconciler.Reconcile() instead, which is common to the main.go and _test.go instances.
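A sketch of qualifying the logger inside Reconcile itself; the WithValues key is an assumption:

package controllers

import (
	"context"

	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

type DRPolicyReconciler struct {
	client.Client
	Scheme *runtime.Scheme
}

func (r *DRPolicyReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Qualify the log here so the main.go and suite_test.go managers get
	// the same logger name without configuring it themselves.
	log := ctrl.Log.WithName("controllers").WithName("DRPolicy").WithValues("name", req.Name)
	log.Info("reconcile enter")
	defer log.Info("reconcile exit")

	return ctrl.Result{}, nil
}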
Force-pushed from e4b57a2 to f5c91be (compare)
A bug in the drpolicy delete unit test was fixed and it now passes.
Force-pushed from f5c91be to 84306fe (compare)
if err := manifestWorkUtil.ClusterRolesCreate(drpolicy); err != nil {
	return ctrl.Result{}, fmt.Errorf("cluster roles create: %w", err)
}
As mentioned here (https://github.com/RamenDR/ramen/blob/main/controllers/drplacementcontrol_controller.go#L332-L335), it may be better to incorporate the task of validating the schedule in DRPolicy instead of DRPC.
Thanks Raghu. #201 tracks this issue.
@hatfieldbrian rebase to main and any comments that you can tackle are good to get this merged.
Force-pushed from 46b229f to 8267b48 (compare)
- drpolicy create/update:
  - finalizer add if absent
  - cluster roles manifestwork create, if absent, for each cluster in drpolicy
- drpolicy delete:
  - cluster roles manifestwork delete for each cluster not in any other drpolicy
  - finalizer remove if present
- gomega version increase from 1.14.0 to 1.15.0

Signed-off-by: bhatfiel <bhatfiel@redhat.com>
Force-pushed from 8267b48 to 30c4bfb (compare)
@hatfieldbrian looks like CI is failing due to go.sum updates
@BenamarMk or @vdeenadh if you can take a look and add an approval we can merge this.
@ShyamsundarR already approved. We'll merge once the build is complete
Signed-off-by: bhatfiel <bhatfiel@redhat.com>